gh-144995: Optimize memoryview == memoryview #144996
vstinner wants to merge 8 commits into python:main
Results of the benchmark from the issue: comparing a memoryview with itself is no longer O(n) but O(1), because the values are no longer compared.
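A rough way to reproduce this kind of measurement is sketched below. It is not the benchmark from the issue: the helper name `time_self_comparison` and the buffer sizes are illustrative assumptions, and absolute timings depend on the machine and the Python build.

```python
import timeit


def time_self_comparison(nbytes, repeat=5, number=10):
    """Return the best per-call time (seconds) of `view == view`.

    Hypothetical helper for illustration; not part of the PR.
    """
    view = memoryview(bytes(nbytes))  # zero-filled buffer, format 'B'
    timings = timeit.repeat(lambda: view == view, repeat=repeat, number=number)
    return min(timings) / number


# If values are compared, the time should grow with the buffer size.
# If the identity shortcut applies, it should stay roughly constant.
for size in (10**3, 10**4, 10**5, 10**6):
    print(f"{size:>8} bytes: {time_self_comparison(size):.3e} s")
```

On an interpreter without the optimization the printed times are expected to grow roughly linearly with the size; with it they should stay roughly flat.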
Objects/memoryobject.c (outdated)
    }

    static int
    is_float_format(const char *format)
Does this cover the complex types?
import numpy as np
a = np.array([1+2j, 3+4j, float('nan')], dtype=np.complex128)
mv = memoryview(a)
mv == mv # False
This memory format is Zd. Oh, my change doesn't work for this memoryview. I should replace the blocklist with an allowlist. I'm not a memoryview/buffer expert. I didn't know that 3rd party projects can have their own format.
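The underlying NaN issue can also be seen without NumPy. The sketch below uses only `array.array` from the standard library; the variable names are illustrative. It shows why an identity shortcut is safe for integer formats but must not be applied to float formats.

```python
import array

# 'd' (C double) is a float format: the buffer can hold NaN,
# and NaN is not equal to itself.
floats = array.array("d", [1.0, float("nan")])
float_view = memoryview(floats)

# 'i' (C int) is an integer format: every value is equal to itself.
ints = array.array("i", [1, 2, 3])
int_view = memoryview(ints)

float_self_equal = (float_view == float_view)
int_self_equal = (int_view == int_view)

print(float_self_equal)  # False: the NaN element compares unequal to itself
print(int_self_equal)    # True
```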
@eendebakpt: I updated the PR to allow formats known to be safe for pointer comparison (integer types), instead of blocking formats known to use floats. I excluded the format
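The allowlist idea can be modeled in a few lines of Python. This is only a sketch: the real check is written in C in Objects/memoryobject.c and may differ in detail. The names `INTEGER_FORMATS`, `identity_shortcut_is_safe` and `model_equal` are hypothetical, the set of codes is taken from the integer formats exercised by the tests in this PR, and the model only handles one-dimensional views.

```python
import array

# Assumption: integer codes for which "same object" implies "equal".
INTEGER_FORMATS = frozenset("bBhHiIlLqQ")


def identity_shortcut_is_safe(fmt):
    """Return True if `view == view` can be answered without reading values."""
    return fmt in INTEGER_FORMATS


def model_equal(a, b):
    """Simplified model of memoryview equality for 1-D views."""
    if a is b and identity_shortcut_is_safe(a.format):
        return True  # O(1): no element is read
    x, y = a.tolist(), b.tolist()
    # O(n) fallback: compare element by element, so NaN != NaN is honored.
    return len(x) == len(y) and all(p == q for p, q in zip(x, y))


int_view = memoryview(array.array("i", [1, 2, 3]))
nan_view = memoryview(array.array("d", [float("nan")]))

print(model_equal(int_view, int_view))  # True, via the shortcut
print(model_equal(nan_view, nan_view))  # False, via the fallback
```

With an allowlist, an unknown third-party format such as `Zd` falls through to the slow path by default, whereas a blocklist would have to enumerate every unsafe format in advance.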
I think adding the
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
    # A memoryview is equal to itself: there is no need to compare
    # individual values. This is not true for float values since they can
    # be NaN, and NaN is not equal to itself.
    for int_format in 'bBhHiIlLqQ':
Can "?" be tested? Can a format starting with "@" be tested? Can the null format be tested?
I don't know how to test these formats. array.array doesn't support "P" and "?" formats and it doesn't support "@" byte order. Do you have an idea how to test these cases?
memoryview.cast() supports them.
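A minimal sketch of how such views can be built with `memoryview.cast()` follows. The variable names are illustrative, and only the byte values 0 and 1 are used for the "?" format.

```python
data = b"\x00\x01"

# Cast the default 'B' view to the boolean format.
bool_view = memoryview(data).cast("?")

# Cast to a native format written with an explicit '@' prefix.
native_view = memoryview(data).cast("@b")

print(bool_view.tolist())        # [False, True]
print(native_view.tolist())      # [0, 1]
print(bool_view == bool_view)    # True
print(native_view == native_view)  # True
```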
Surprisingly:
>>> memoryview(b'\0\1').cast('?') == memoryview(b'\0\2').cast('?')
False
even if
>>> list(memoryview(b'\0\1').cast('?')) == list(memoryview(b'\0\2').cast('?'))
True
But this may be platform-dependent, so I would not test values other than 0 and 1. Or is 1 also not safe?
It may be undefined behavior to interpret arbitrary values other than 0 as void* (even if it works on x86). Maybe there is a way to create an array of pointers in ctypes? Or is it not worth the bother?
* Optimize also "P" format
* Test also "m != m"
* Handle native formats such as "@b"
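One way to picture the native-format handling from this commit is as a normalization step before the allowlist lookup. The sketch below is a Python model, not the C code from the PR. The names `SAFE_NATIVE_CODES`, `strip_native_prefix` and `shortcut_applies` are hypothetical, and treating every other byte-order prefix as "not eligible" is a conservative assumption made for this example.

```python
# Assumption: integer codes from the tests, plus 'P' (pointer).
SAFE_NATIVE_CODES = frozenset("bBhHiIlLqQP")


def strip_native_prefix(fmt):
    """Drop a leading '@' (native byte order, size and alignment)."""
    return fmt[1:] if fmt.startswith("@") else fmt


def shortcut_applies(fmt):
    """Return True if the format is a single allowlisted native code."""
    code = strip_native_prefix(fmt)
    return len(code) == 1 and code in SAFE_NATIVE_CODES


for fmt in ("b", "@b", "P", "@P", "d", "@d", "<i", "Zd"):
    print(f"{fmt!r:>6}: {shortcut_applies(fmt)}")
```

In this model "b" and "@b" are treated identically, while float formats and formats the model does not recognize fall back to the value-by-value comparison.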
I updated the PR to address @serhiy-storchaka's review:
I added tests on 4 more formats:
serhiy-storchaka left a comment
I have some doubts about the 'P' test. It may be an operation with undefined behavior (although CPython may never run on platforms where this does not work, I am not sure). It would be safer to omit that test. There are no other tests for the 'P' format. But the optimization should work for it (if we exclude undefined behavior).
I modified the tests to check that the result with optimization is the same as the result without optimization:

    def check_equal(view, is_equal):
        self.assertEqual(view == view, is_equal)
        self.assertEqual(view != view, not is_equal)

        # Comparison with a different memoryview doesn't use
        # the optimization and should give the same result.
        view2 = memoryview(view)
        self.assertEqual(view2 == view, is_equal)
        self.assertEqual(view2 != view, not is_equal)
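For readers who want to run this check outside the CPython test suite, a self-contained sketch follows. The class name `SelfComparisonTest` and the sample values are assumptions made for this example, and `check_equal` is written as a method rather than a nested function. It only covers formats that `array.array` supports.

```python
import array
import unittest


class SelfComparisonTest(unittest.TestCase):
    def check_equal(self, view, is_equal):
        # Same object on both sides: may take the identity shortcut.
        self.assertEqual(view == view, is_equal)
        self.assertEqual(view != view, not is_equal)

        # A second memoryview over the same buffer is a different object,
        # so the values are compared; the result must be the same.
        view2 = memoryview(view)
        self.assertEqual(view2 == view, is_equal)
        self.assertEqual(view2 != view, not is_equal)

    def test_integer_formats(self):
        for int_format in "bBhHiIlLqQ":
            with self.subTest(format=int_format):
                a = array.array(int_format, [1, 2, 3])
                self.check_equal(memoryview(a), True)

    def test_float_formats_with_nan(self):
        for float_format in "fd":
            with self.subTest(format=float_format):
                a = array.array(float_format, [1.0, float("nan")])
                self.check_equal(memoryview(a), False)
```

Saved to a file, this can be run with `python -m unittest <file>`. It should pass both with and without the optimization, which is the property the helper is meant to verify.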
For boolean ( If you are not confident with my